The dataset is an archived Twitter data for user @dog_rates, also known as WeRateDogs, the archived datasets dates back to 1st of August, 2017. it contains ratings on people's dogs with a humorous comments about the dog. These ratings almost always have a denominator of 10, with over 4 million followers and has received international media coverage. Three(3) different datasets was obtained during the course of analysis, which comprises of; twitter archive: The WeRateDogs Twitter archive contains basic tweet data for all 5000+ of their tweets, twitter API - tweet Json: Back to the basic-ness of Twitter archives: retweet count and favorite count are two of the notable column omissions and image prediction: a table full of image predictions (the top three only) alongside each tweet ID, image URL, and the image number that corresponded to the most confident predictions. At the course of analysis, Insights like:
These were all noted as point of interest. Visualization was carried out using bar graph for exploratory analysis as would be depicted subsequently. From the visualization of dog stage proportion, it was realized pupper had the highest count, followed by doggo and then the least was floofer out of the 4 dog stages covered during the study period as it can be seen below.
It can be further stated the need to clean stage column with 2 dog stage as signified above. Mostly, however, pupper refers to when a dog is only just a puppy.
Depending on the breed size, puppies vary in weight anywhere from one to 23 pounds. Thanks to lots of bonding time with their mom and littermates, a puppy is typically mature enough to find its furever-home as early as eight weeks old. However, that same puppy is still far from entering adulthood.
Golden Retrievers are one of the most popular dog breeds for good reason. They are intelligent, loyal, easy to train, and very affectionate. They make wonderful family dogs because they are great with young children and other dogs.
Golden Retrievers are medium-sized sporting dogs that weigh on average 55-75 pounds, with females weighing on the lower end of this range. Their height can range from 21-24 inches tall. They have a broad head, short ears, deep chest, and very muscular build. Golden Retriever appeared to be the most predicted dog predictions in the WeRateDogs Twitter archive as shown in the graph below. Golden retriever was seen as the most popular dog breed with count of 135, followed by labrador retriever with count of 91.
The Japanese Chin (Japanese: 狆, chin), also known as the Japanese Spaniel, is a dog known for its strabismus of the eyes. Being both a lap dog and a companion dog, appears to be one of the least predicted dog breed in the image prediction for the Twitter archive.
The next insight obtained was the fact, Iphone appears to be most prevelent amongst twitter users, with TweetDeck been the least favourable choice. A graphical view of this depiction can be seen in the graph below.
What is the most common factor to favourite counts?, Is there a relationship between the number of retweets and favorites dog tweets?, this prompted the need for investigating the relationship between favorite count and retweet count, where it appear to have a positive impacts of about 91% degree of relationship. The graph is shown in the next figure below.
The above graph as earlier explained, signifies a positive relationship. Does this now mean, dog breed with higher retweets are the most favorites?, the next table shows a perfect explanation as investigated from the twitter archive datasets.
The Labrador Retriever or simply Labrador is a British breed of retriever gun dog. It was developed in the United Kingdom from fishing dogs imported from the colony of Newfoundland, and was named after the Labrador region of that colony. This breed appeared to be the most retweeted but not necessarily the most favorite count.
The Lakeland Terrier is a dog breed, which takes its name from its place of origin, the Lake District in England. The dog is a small to mid-size member of the Terrier family. While independent in personality, it interacts well with owners and all family members, it appears to be the breed with most favourite count but not the most retweets.
English setter on the other hand appears to be the most lowest in both retweets and favourite count.
Out of the initial 2356 rows and 17 columns that was obtained at the initial stage of analysis, the resulting datasets after cleaning operations was 1463 rows and 14 columns, which shows the extent of our dirty and messy data contained in our initial dataset. 8 quality issues and 2 tidiness issues was addressed during the course of analysis and all resultant solutions was merged to form a master dataset called "twitter_archive_master.csv". Three (3) insights was obtained from the refined dataset to observe pupper been the highest dog stage, golden retriever been the most predicted dog breed and iphone as the most popular tweet source amongst users for the study period been investigated. Limitations encountered was, not been able to have access to the algorithm used for the dog prediction, which would have further enhanced the quality of analysis and also the rejection of twitter developer account application by twitter team.